
Conversation

@eliteprox (Collaborator) commented on Sep 15, 2025

This pull request introduces a comprehensive example for model loading and loading overlays, adds configuration and handler support for warmup routines, and improves queue management and the testing configuration. The most important changes are grouped below:

New Example and Documentation Updates

  • Added a new examples/loading_overlay.py example demonstrating non-blocking model loading, health state transitions, animated loading overlays, and real-time parameter updates. This helps users understand how to implement loading overlays and manage model load states (a sketch of the pattern follows this list).
  • Updated the README.md to reference the new example files for green tint and model loading overlays, improving documentation accuracy.
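
For orientation, here is a minimal sketch of the pattern the example demonstrates. Only the create_loading_frame call and its RGB color_format are confirmed by the diff quoted later in this thread; the import path, the model_ready flag, and the passthrough logic are illustrative assumptions.

    from pytrickle.frame_utils import create_loading_frame  # import path assumed

    model_ready = False   # flipped to True by the non-blocking load_model task
    frame_counter = 0     # drives the overlay animation

    def process_video(frame, width: int, height: int):
        """Serve an animated loading overlay until the model is ready."""
        global frame_counter
        if not model_ready:
            frame_counter += 1
            # RGB format to match tensor expectations, as noted in the example
            return create_loading_frame(width, height, "Loading model...",
                                        frame_counter, color_format="RGB")
        return frame  # passthrough once loading has finished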

Warmup Handler and Configuration Support

  • Added WarmupConfig and WarmupMode to the public API, and introduced a warmup decorator in pytrickle/decorators.py for registering warmup handlers. This enables explicit warmup routines for models and resources (a usage sketch follows this list).
  • Enhanced FrameProcessor to support warmup configuration, warmup state management, and a thread-safe model loading process that includes warmup execution. This ensures models and overlays are properly initialized before serving requests.
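
A rough sketch of how a warmup handler might be registered. The warmup decorator, WarmupConfig, and WarmupMode names come from this PR, but their exact signatures and fields are assumptions:

    from pytrickle.decorators import warmup

    @warmup  # decorator arguments, if any, are not shown in this PR
    async def warm_pipeline():
        # Push a few dummy frames through the model so the first real
        # request doesn't pay allocator/JIT warm-up costs.
        ...

Per the commit notes further down, FrameProcessor triggers registered warmup handlers automatically after model loading, so no manual invocation is needed.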

Queue Management and Testing Improvements

  • Added a clear_input_queues method to TrickleClient to clear stale video and audio frames when parameters change, preventing outdated frames from being processed with new settings (see the sketch after this list).
  • Introduced a default pytest configuration in .vscode/settings.json for easier test discovery and execution.
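
The commit notes below spell out the intended ordering ("call clear_input_queues before updating parameters"). Roughly, on the server side, with hypothetical handler and updater names:

    from pytrickle import TrickleClient  # import path assumed

    async def handle_update_params(client: TrickleClient, processor, params: dict):
        # Drop queued video/audio frames so they aren't processed
        # under the new parameters.
        await client.clear_input_queues()
        processor.update_params(params)  # hypothetical updater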

API and Import Cleanups

  • Refactored imports and type definitions: moved ErrorCallback to pytrickle/base.py, updated related imports across files, and exposed new API functions such as build_loading_overlay_frame.
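
For reference, the new export would be picked up roughly like so, assuming it is re-exported from the package root as the description implies:

    from pytrickle import build_loading_overlay_frame  # newly exposed API function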

Development Environment

  • Added a new VSCode launch configuration for running the loading overlay example with orchestrator environment variables, improving developer workflow. (.vscode/launch.json)

@eliteprox marked this pull request as ready for review on September 15, 2025 22:29
@eliteprox changed the title from "load_model sync, attach frame processor to server health state" to "Implement non-blocking model preloading with accurate health state management" on Sep 20, 2025
@eliteprox changed the title from "Implement non-blocking model preloading with accurate health state management" to "Implement non-blocking model loading with accurate health state management" on Sep 20, 2025
@eliteprox marked this pull request as draft on September 25, 2025 19:57
@eliteprox self-assigned this on Oct 1, 2025
@eliteprox marked this pull request as ready for review on October 27, 2025 17:32
@eliteprox force-pushed the fix/load-model-sync branch from 64f3073 to 6962fc0 on October 27, 2025 18:55
@eliteprox requested a review from pschroedl on October 28, 2025 21:51
@eliteprox requested a review from JJassonn69 on October 31, 2025 16:46
@eliteprox changed the title from "Implement non-blocking model loading with accurate health state management" to "Implement non-blocking model loading with health state transition" on Nov 3, 2025
@eliteprox force-pushed the fix/load-model-sync branch 3 times, most recently from d00deef to 15ea366 on November 5, 2025 23:55
@eliteprox force-pushed the fix/load-model-sync branch from 15ea366 to 2b17af1 on November 6, 2025 00:09
return frame

# Create loading overlay frame using utility (RGB format to match tensor expectations)
loading_frame = create_loading_frame(width, height, loading_message, frame_counter, color_format="RGB")

Contributor:
Should we do this internally, so that if the pipeline is not ready yet it displays the default overlay? If we want to let the user customize the loading frames, we could maybe add a modify_loading_frames hook that they can import and override.

Collaborator Author (@eliteprox):
Agree, will move this into frame_processor

@JJassonn69 (Contributor):
Tested the changes with the decorated example/grayscale_chipmunk_example.py and it works without hiccups. There's no need to invoke the internal await processor._frame_processor.load_model() function anymore.

Commits added:

…ing overlay logic in video processing
- move ErrorCallback to frame_processor to fix editable install

…rocessing
- Implemented clear_input_queues method in TrickleClient to clear pending frames from the video and audio input queues.
- Updated StreamServer to call clear_input_queues before updating parameters, ensuring stale frames are not processed.
- Enhanced logging for model loading and warmup sequences in FrameProcessor.

…lated components
- Added a warmup decorator for handler functions in decorators.py.
- Enhanced FrameProcessor to automatically trigger warmup after model loading.
- Introduced WarmupProtocol for defining warmup handler signatures in registry.py.
- Updated StreamProcessor to support warmup handlers and manage warmup sequences.
- Improved logging for warmup processes to ensure better traceability.
@eliteprox requested a review from JJassonn69 on November 7, 2025 17:15

# Handle warmup sentinel message (extract it with pop)
warmup_params = params.pop("warmup", None)
if warmup_params is not None:

Contributor:
Should this trigger on all model loads as well?

to the user's param_updater callback.
"""
# Handle model loading sentinel message (extract it with pop)
if params.pop("load_model", False):

Contributor:
What is the workflow here? Is this an indication that the client thinks the update requires a significant, time-consuming change to the pipeline?

If it is, what do you think of using reload_pipeline? This could cover loading a LoRA, changing pipeline configs that are only available at load time, etc.

@eliteprox (Collaborator Author) commented on Nov 8, 2025:

I like that idea of reload_pipeline. We can implement it as a managed state control, OK/IDLE -> LOADING -> OK/IDLE, which runs load_model and warmup via a prompt update. pytrickle is sort of "model unaware", so I don't think we should do anything like force-freeing GPU memory. IMHO it should be up to the consuming app to track and manage the state of its model, and to handle load_model callbacks appropriately when its model is already loaded into memory. wdyt? @ad-astra-video

ComfyStream is unique in this because it automatically loads/unloads VRAM based on workflow changes.
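
A rough sketch of that managed state control as a frame-processor method, where reload_pipeline is the proposed name and run_warmup stands in for whatever warmup entry point lands:

    async def reload_pipeline(self, params: dict):
        # OK/IDLE -> LOADING -> OK/IDLE, driven by a prompt update
        self.set_state(PipelineState.LOADING)
        try:
            await self.load_model(**params)   # consuming app manages its model
            await self.run_warmup()           # assumed warmup hook
        finally:
            self.set_state(PipelineState.IDLE)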

from typing import Optional, Dict, Any

-from . import ErrorCallback
+from .base import ErrorCallback

Contributor:
Can you document in the PR why ErrorCallback had to move? Explaining how you found the issue and how you tested it would help too.

Collaborator Author (@eliteprox):

I did some testing and found this change unnecessary, so it's been reverted in 5a1cb95.

logger.info("State transition: LOADING → IDLE")
self.set_state(PipelineState.IDLE)
else:
logger.info(f"State already {self._state.name}, not transitioning")

Contributor:
This seems like it should be debug level, but I'll defer to your judgement on whether it provides value in development, given your experience working this into ComfyStream.

Collaborator Author (@eliteprox):
I agree, this is an excess debug log

}, status=400)

# Clear input queues before updating to avoid processing stale frames
await self.current_client.clear_input_queues()

Contributor:
As we discussed, I think this should be a setting in the update params so it can be turned off. I'm not sure we should drop all frames in the input queue on every params update.
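
One possible shape for that setting, with a hypothetical parameter name:

    # Let callers opt out of dropping queued frames on a params update.
    if params.pop("clear_input_queues", True):
        await self.current_client.clear_input_queues()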
